Reviews: MAVEN: Multi-Agent Variational Exploration

Neural Information Processing Systems

The StarCraft results also seem fine, but not so strong as to make it obvious that committed exploration is a crucial empirical improvement over QMIX: while MAVEN agents learn faster on 3s5z, the final performance looks the same; MAVEN agents seem to have less variability in final win rate on 5m_vs_6m; and QMIX actually seems to have better final performance on 10m_vs_11m. The results in Figures 2 and 4 do, however, suggest that there may be scenarios where the advantage of MAVEN is larger.

Minor comments:
1) Line 64 and others: the subscript "qmix" should probably be wrapped in "\text{}".
2) First equation in Section 3: there is an inconsistency between subscripts and superscripts, i.e. u_i vs. u^i.
3) Line 81: perhaps better phrased as "the *best* action of agent i...".
4) Line 86: should this be u^i_n - u^i_U?
5) Line 87: I was confused by what "the set of all possible such orderings over the action-values" means. Besides a degeneracy when some of the Q-values are identical, isn't there only one valid ordering? Or are you just trying to cover that degeneracy?
6) Definition 1: perhaps add an intuitive explanation, e.g. "Intuitively, a Q-function is non-monotonic if the ordering of best actions for agent i can be affected by the other agents' action choices at that time step."
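To make the non-monotonicity point in comment 6 concrete, here is a minimal sketch (the payoff values and variable names are illustrative, not taken from the paper) of a two-agent matrix game whose joint action-value function is non-monotonic in the Definition 1 sense: agent 1's best action flips depending on which action agent 2 takes.

```python
import numpy as np

# Illustrative joint action-value table for a 2-agent, 2-action matrix game
# (a "climb"-style payoff; the numbers are made up for illustration).
# Rows index agent 1's action, columns index agent 2's action.
Q_joint = np.array([
    [8.0, -12.0],   # agent 1 plays action 0
    [-12.0, 0.0],   # agent 1 plays action 1
])

# Agent 1's best action for each fixed action of agent 2.
best_response = Q_joint.argmax(axis=0)
print("Agent 1's best action if agent 2 plays 0:", best_response[0])  # -> 0
print("Agent 1's best action if agent 2 plays 1:", best_response[1])  # -> 1

# Non-monotonic (per Definition 1) if the ordering of agent 1's actions
# changes with agent 2's choice.
print("Non-monotonic:", len(set(best_response)) > 1)  # -> True
```

A monotonic factorisation cannot represent such a table exactly, since it forces each agent's greedy action to be independent of the other agents' choices; that representational gap is the pathology MAVEN's committed exploration is meant to address.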


Latent Space Policies for Hierarchical Reinforcement Learning

Haarnoja, Tuomas, Hartikainen, Kristian, Abbeel, Pieter, Levine, Sergey

arXiv.org Machine Learning

We address the problem of learning hierarchical deep neural network policies for reinforcement learning. Our aim is to design a hierarchical reinforcement learning algorithm that can construct hierarchical representations in a bottom-up, layerwise fashion. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective. Each layer is also augmented with latent random variables, which are sampled from a prior distribution during the training of that layer. The maximum entropy objective causes these latent variables to be incorporated into the layer's policy, and the higher level layer can directly control the behavior of the lower layer through this latent space. Furthermore, by constraining the mapping from latent variables to actions to be invertible, higher layers retain full expressivity: neither the higher layers nor the lower layers are constrained in their behavior. Our experimental evaluation demonstrates that we can improve on the performance of single-layer policies on standard benchmark tasks simply by adding additional layers, and that our method can solve more complex sparse-reward tasks by learning higher-level policies on top of high-entropy skills optimized for simple low-level objectives.
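To make the stacking construction described above concrete, here is a hedged sketch (the class and function names are invented for illustration, and plain affine maps stand in for the state-conditioned normalizing-flow layers actually used in the paper): each layer is an invertible map from its latent to an action, and the layer above acts by choosing the latent of the layer below, so composing layers loses no expressivity.

```python
import numpy as np

class InvertibleLayer:
    """One policy layer: a bijective map from latent z to action a.

    A real implementation would condition scale/shift on the observation
    (e.g., with a neural network, as in flow-based policy layers); here
    they are fixed parameters to keep the sketch self-contained.
    """

    def __init__(self, scale, shift):
        self.scale = np.asarray(scale, dtype=float)  # must be nonzero for invertibility
        self.shift = np.asarray(shift, dtype=float)

    def forward(self, z):
        return self.scale * z + self.shift      # latent -> action

    def inverse(self, a):
        return (a - self.shift) / self.scale    # action -> latent

def act(layers, top_level_latent):
    """Hierarchical action selection: each higher layer's output is fed in
    as the latent of the layer below it, down to the primitive action."""
    z = top_level_latent
    for layer in reversed(layers):  # layers listed bottom-up, so apply top-down
        z = layer.forward(z)
    return z

# Two stacked layers; the higher layer "steers" the lower one through its latent space.
low = InvertibleLayer(scale=[0.5, 2.0], shift=[0.1, -0.3])
high = InvertibleLayer(scale=[1.5, 0.8], shift=[0.0, 0.2])

z_top = np.random.randn(2)          # sampled from the top-level prior
action = act([low, high], z_top)

# Invertibility means no expressivity is lost: any primitive action is
# reachable from some top-level latent, and we can recover that latent.
recovered = high.inverse(low.inverse(action))
assert np.allclose(recovered, z_top)
```

The closing assert is the point of the invertibility constraint: because every layer's latent-to-action map can be undone, higher layers are never crippled by the layers beneath them, matching the "full expressivity" claim in the abstract.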